Abstract. This paper studies how to verify the conformity of a program with its specification and proposes a novel constraint-programming framework for bounded program verification (CPBPV). The CPBPV framework uses constraint stores to represent the specification and the program and explores execution paths nondeterministically. The input program is partially correct if each constraint store so produced implies the post-condition. CPBPV does not explore spurious execution paths, as it incrementally prunes execution paths early by detecting that the constraint store is not consistent. CPBPV uses the rich language of constraint programming to express the constraint store. Finally, CPBPV is parametrized with a list of solvers which are tried in sequence, starting with the least expensive and least general. Experimental results often show orders of magnitude improvements over earlier approaches, with running times often independent of the variable domains. Moreover, CPBPV was able to detect subtle errors in some programs where other frameworks based on model checking failed.

1 Introduction

This paper is concerned with software correctness, a critical issue in software engineering. It proposes a novel constraint-programming framework for bounded program verification (CPBPV), i.e., when the program inputs (e.g., the array lengths and the variable values) are bounded. The goal is to verify the conformity of a program with its specification, that is, to demonstrate that the specification is a consequence of the program. The key idea of CPBPV is to use constraint stores to represent the specification and the program, and to non-deterministically explore execution paths over these constraint stores. This non-deterministic constraint-based symbolic execution incrementally refines the constraint store, which initially consists of the precondition. Non-determinism occurs when executing conditional or iterative instructions, and the non-deterministic execution refines the constraint store by adding constraints coming from conditions and from assignments. The input program is partially correct if each constraint store produced by the symbolic execution implies the post-condition. It is important to emphasize that CPBPV considers programs with complete specifications and that verifying the conformity between a program and its specification requires checking (explicitly or implicitly) all executable paths. This is not the case in model-checking tools designed to detect violations of some specific property, e.g., safety or liveness properties.

The CPBPV framework has a number of fundamental benefits. First, contrary to earlier work using constraint programming or SMT [2,11,12], CPBPV does not use predicate abstraction or explore spurious execution paths, i.e., paths that do not correspond to actual executions over inputs satisfying the pre-condition. CPBPV incrementally prunes execution paths early by detecting that the constraint store is not consistent. Second, CPBPV uses the rich language of constraint programming to express the constraint store, including arbitrary logical and threshold combinations of constraints, the element constraint, and global/combinatorial constraints that express complex relationships on a set of variables. Finally, CPBPV is parametrized with a list of solvers which are tried in sequence, starting with the least expensive and least general.

The CPBPV framework was evaluated experimentally on a series of benchmarks from program verification. Experimental results of our (slow) prototype often show orders of magnitude improvements over earlier approaches, and indicate that the running times are often independent of the variable domains. Moreover, CPBPV was able to find subtle errors in some programs that some other verification frameworks based on model checking could not detect.

The rest of the paper is organized as follows. Section 2 illustrates how CPBPV handles constraint stores on a motivating example. Section 3 formalizes the CPBPV framework for a small programming language and Section 4 discusses the implementation issues. Section 5 presents experimental results on a number of verification problems, comparing our approach with state-of-the-art model-checking based verification frameworks. Section 6 discusses related work in test generation, bounded program verification and software model checking. Section 7 summarizes the contributions and presents future research directions.



2 The Constraint-Programming Framework at Work 

This section illustrates the CPBPV verifier on a motivating example, the binary search program. CPBPV uses Java programs and JML specifications for the pre- and post-conditions, appropriately enhanced to support the expressivity of constraint programming. Figure 1 depicts a binary search program to determine if a value v is present in a sorted array t. (Note that \result in JML corresponds to the value returned by the program.) To verify this program, our prototype implementation requires a bound on the length of array t, on its elements, and on v. We will verify its correctness for specific lengths and simply assume that the values are signed integers on a number of bits.

The initial constraint store of the CPBPV verifier, assuming an input array of length 8, is the precondition¹ $c_{pre} \equiv \forall\, 0 \le i < 7 : t^0[i] \le t^0[i+1]$, where $t^0$ is an array of constraint variables capturing the input. The constraint variables are annotated with a version number as CPBPV performs an SSA-like renaming

¹ We omit the domain constraints on the variables for simplicity.
     /*@ requires (\forall int i; i>=0 && i<t.length-1; t[i]<=t[i+1])
       @ ensures
       @   (\result != -1 ==> t[\result] == v) &&
       @   (\result == -1 ==> \forall int k; 0 <= k < t.length; t[k] != v) @*/
 1   static int binary_search(int[] t, int v) {
 2     int l = 0;
 3     int u = t.length-1;
 4     while (l <= u) {
 5       int m = (l + u) / 2;
 6       if (t[m]==v)
 7         return m;
 8       if (t[m] > v)
 9         u = m - 1;
10       else
11         l = m + 1; }    // ERROR else u = m - 1;
12     return -1; }

Fig. 1. The Binary Search Program



[10] on the fly since each assignment generates constraints possibly linking the old and the new values of the assigned variable. The assignments in lines 2-3 add the constraints $l^0 = 0 \wedge u^0 = 7$. CPBPV then considers the loop instruction. Since $l^0 \le u^0$, it enters the loop body, adds the constraint $m^0 = (l^0 + u^0)/2$, which simplifies to $m^0 = 3$, and considers the conditional statement on line 6. The execution of the statement is nondeterministic: indeed, both $t^0[3] = v^0$ and $t^0[3] \neq v^0$ are consistent with the constraint store, so that the two alternatives, which give rise to two execution paths, must be explored. Note that these two alternatives correspond to actual execution paths in which t[3] in the input is equal to, or different from, input v. The first alternative adds the constraint $t^0[3] = v^0$ to the store and executes line 7, which adds the constraint $result = m^0$. CPBPV has thus obtained an execution path p whose final constraint store $c_p$ is:

$$c_{pre} \wedge l^0 = 0 \wedge u^0 = 7 \wedge m^0 = (l^0 + u^0)/2 \wedge t^0[m^0] = v^0 \wedge result = m^0$$

CPBPV then checks whether this store $c_p$ implies the post-condition $c_{post}$ by searching for a solution to $c_p \wedge \neg c_{post}$. This test fails, indicating that the computation path p, which captures the set of actual executions in which t[3] = v, satisfies the specification. CPBPV then explores the other alternative to the conditional statement in line 6. It adds the constraint $t^0[m^0] \neq v^0$ and executes the conditional statement in line 8. Once again, this statement is nondeterministic. Its first alternative assumes that the test holds, generating the constraint $t^0[m^0] > v^0$ and executing the instruction in line 9. Since u is (re-)assigned, CPBPV creates a new variable $u^1$ and posts the constraint $u^1 = m^0 - 1 = 2$. The execution returns to line 4, where the test now reads $l^0 \le u^1$, since CPBPV always uses the most recent version for each variable. Since the constraint store entails $l^0 \le u^1$, the only extension to the current path consists of executing line 5, adding the constraint $m^1 = (l^0 + u^1)/2$, which actually simplifies to $m^1 = 1$. Another complete execution path is then obtained by executing lines 6 and 7.
Consider now a version of the program in which line 11 is replaced by u = m-1. To illustrate the CPBPV verifier, we specify partial execution paths by indicating which alternative is selected for each nondeterministic instruction. For instance, $(T_4, F_6, T_8, T_4, T_6)$ denotes the last execution path discussed above, in which the true alternative is selected for the first execution of the instruction in line 4, the false alternative for the first execution of instruction 6, the true alternative for the first execution of instruction 8, the true alternative for the second execution of instruction 4, and the true alternative for the second execution of instruction 6. Consider the partial path $(T_4, F_6, F_8)$ and let us study how it can be extended. The partial path $(T_4, F_6, F_8, T_4, T_6)$ is not explored, since it produces a constraint store containing

$$c_{pre} \wedge t^0[3] \neq v^0 \wedge t^0[3] \le v^0 \wedge t^0[1] = v^0$$

which is clearly inconsistent. Similarly, the path $(T_4, F_6, F_8, T_4, F_6, T_8)$ cannot be extended. The output of CPBPV on this incorrect program when executed on an array of length 8 (with integers coded on 8 bits to make it readable) produces, in 0.025 seconds, the counterexample:

$$v^0 = -126 \wedge t^0 = [-128, -127, -126, -125, -124, -123, -122, -121] \wedge result = -1.$$
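The counterexample can be replayed by hand on the faulty program. The sketch below is only an illustration: the class name BinarySearchReplay and the buggy flag are hypothetical names that do not come from the paper. It runs the program of Figure 1 and the variant in which line 11 is replaced by u = m - 1 on the reported input.

```java
final class BinarySearchReplay {
    /**
     * Binary search of Figure 1. When buggy is true, line 11 is replaced
     * by "u = m - 1", as in the incorrect version discussed in the text.
     * (The buggy flag is an illustrative device, not part of the paper.)
     */
    static int search(int[] t, int v, boolean buggy) {
        int l = 0;
        int u = t.length - 1;
        while (l <= u) {
            int m = (l + u) / 2;
            if (t[m] == v) return m;
            if (t[m] > v) {
                u = m - 1;
            } else if (buggy) {
                u = m - 1;      // faulty line 11
            } else {
                l = m + 1;      // correct line 11
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] t = {-128, -127, -126, -125, -124, -123, -122, -121};
        System.out.println(search(t, -126, true));   // faulty program
        System.out.println(search(t, -126, false));  // correct program
    }
}
```

On this input the faulty variant visits indices 3, 1 and 0 and returns -1 although v occurs at index 2, which violates the second clause of the post-condition; the correct program returns 2.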

This example highlights a few interesting benefits of CPBPV. 

1. The verifier only considers paths that correspond to collections of actual inputs (abstracted by constraint stores). The resulting execution paths must all be explored since our goal is to prove the partial correctness of the program.

2. The performance of the verifier is independent of the integer representation 
on this application: it only requires a bound on the length of the array. 

3. The verifier returns a counter-example for debugging the program. 

Note that CBMC and ESC/Java2, two state-of-the-art model checkers, fail to verify this example, as discussed in Section 5.

3 Formalization of the Framework 

This section formalizes the CPBPV verifier on a small abstract language using a 
small-step SOS semantics. The semantics primarily specifies the execution paths 
over constraint stores explored by the verifier. It features assert and enforce 
constructs which are necessary for modular composition. 

Syntax. Figure 2 depicts the syntax of the programs and the constraints generated by the verifier. In the following, we use s, possibly subscripted, to denote elements of a syntactic entity S.

Renamings. CPBPV creates variables and arrays of variables "on-the-fly" when they are needed. This process resembles an SSA normalization but does not introduce the join nodes, since the results of different execution paths are not merged. Similar renamings are used in model checking. The renaming uses mappings of type $\mathcal{V} \cup \mathcal{A} \to \mathcal{N}$ which map variables and arrays to natural numbers
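L : list of instructions; I : instructions; B : Boolean expressions
E : integer expressions; A : arrays; V : variables

$$\begin{array}{rcl}
L & ::= & I ; L \mid \epsilon \\
I & ::= & A[E] \leftarrow E \mid V \leftarrow E \mid \textbf{if}\ B\ I \mid \textbf{while}\ B\ I \mid \textbf{assert}(B) \mid \textbf{enforce}(B) \mid \textbf{return}\ E \mid \{L\} \\
B & ::= & true \mid false \mid E > E \mid E \ge E \mid E = E \mid E \neq E \mid E \le E \mid E < E \\
B & ::= & \neg B \mid B \wedge B \mid B \vee B \mid B \Rightarrow B \\
E & ::= & V \mid A[E] \mid E + E \mid E - E \mid E \times E \mid E / E
\end{array}$$

C : constraints; $E^+$ : solver expressions
$V^+ = \{v^i \mid v \in V \ \&\ i \in \mathcal{N}\}$ : solver variables
$A^+ = \{a^i \mid a \in A \ \&\ i \in \mathcal{N}\}$ : solver arrays

$$\begin{array}{rcl}
C & ::= & true \mid false \mid E^+ > E^+ \mid E^+ \ge E^+ \mid E^+ = E^+ \mid E^+ \neq E^+ \mid E^+ \le E^+ \mid E^+ < E^+ \\
C & ::= & \neg C \mid C \wedge C \mid C \vee C \mid C \Rightarrow C \\
E^+ & ::= & V^+ \mid A^+[E^+] \mid E^+ + E^+ \mid E^+ - E^+ \mid E^+ \times E^+ \mid E^+ / E^+
\end{array}$$

Fig. 2. The Syntax of Programs and Constraints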

denoting their current "version numbers". In the semantics, the version number is incremented each time a variable or an array element is assigned. We use $\sigma_\perp$ to denote the uniform mapping to zero (i.e., $\forall x \in \mathcal{V} \cup \mathcal{A} : \sigma_\perp(x) = 0$) and $\sigma[x/i]$ the mapping $\sigma$ where x now maps to i, i.e., $\sigma[x/i](y) = \textbf{if}\ x = y\ \textbf{then}\ i\ \textbf{else}\ \sigma(y)$. These mappings are used by a polymorphic renaming function $\rho$ to transform program expressions into constraints. For example, $\rho\ \sigma\ (b_1 \oplus b_2) = (\rho\ \sigma\ b_1) \oplus (\rho\ \sigma\ b_2)$ (where $\oplus \in \{\wedge, \vee, \Rightarrow\}$) is the rule used to transform a logical expression.

Configurations. The CPBPV semantics mostly uses configurations of the type $\langle l, \sigma, c\rangle$, where $l$ is the list of instructions to execute, $\sigma$ is a version mapping, and $c$ is the set of constraints generated so far. It also uses configurations of the form $\langle \top, \sigma, c\rangle$ to denote final states and configurations of the form $\langle \perp, \sigma, c\rangle$ to denote the violation of an assertion. The semantics is specified by rules of the form

$$\frac{\text{conditions}}{\gamma_1 \longmapsto \gamma_2}$$

stating that configuration $\gamma_1$ can be rewritten into $\gamma_2$ when the conditions hold.

Conditional Instructions. The conditional instruction if b i considers two cases. If the constraint $c_b$ associated with b is consistent with the constraint store, then the store is augmented with $c_b$ and the body is executed. If the negation $\neg c_b$ is consistent with the store, then the constraint store is augmented with $\neg c_b$. Both rules may apply, since the store may represent some memory states satisfying the condition and some violating it.

$$\frac{c \wedge (\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{if}\ b\ i ; l, \sigma, c\rangle \longmapsto \langle i ; l, \sigma, c \wedge (\rho\ \sigma\ b)\rangle} \qquad \frac{c \wedge \neg(\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{if}\ b\ i ; l, \sigma, c\rangle \longmapsto \langle l, \sigma, c \wedge \neg(\rho\ \sigma\ b)\rangle}$$
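The two rules for the conditional can be mimicked on a toy store. The sketch below makes strong simplifying assumptions that are not part of CPBPV: the store is a predicate over a single integer x in a small interval, and satisfiability is decided by brute-force enumeration rather than by a constraint solver. The names BranchSketch, satisfiable and feasibleBranches are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

final class BranchSketch {
    /** Brute-force check: is there an x in [lo, hi] that satisfies the store? */
    static boolean satisfiable(IntPredicate store, int lo, int hi) {
        for (int x = lo; x <= hi; x++) {
            if (store.test(x)) return true;
        }
        return false;
    }

    /** Alternatives of "if b i" that the two rules allow the verifier to explore. */
    static List<String> feasibleBranches(IntPredicate store, IntPredicate cond, int lo, int hi) {
        List<String> branches = new ArrayList<>();
        // First rule: store AND condition is satisfiable.
        if (satisfiable(store.and(cond), lo, hi)) branches.add("then");
        // Second rule: store AND NOT condition is satisfiable.
        if (satisfiable(store.and(cond.negate()), lo, hi)) branches.add("else");
        return branches;
    }

    public static void main(String[] args) {
        // Unconstrained store: both alternatives must be explored.
        System.out.println(feasibleBranches(x -> true, x -> x > 3, 0, 10));
        // Store already entails the condition: only one alternative remains.
        System.out.println(feasibleBranches(x -> x > 5, x -> x > 3, 0, 10));
    }
}
```

With the store x > 5, the negated condition is inconsistent with the store, so the second alternative is pruned immediately; this is the mechanism by which no spurious path is ever extended.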

Iterative Instructions. The while instruction while b i also considers two cases. If the constraint $c_b$ associated with b is consistent with the constraint store, then the constraint store is augmented with $c_b$, the body is executed, and the while instruction is reconsidered. If the negation $\neg c_b$ is consistent with the constraint store, then the constraint store is augmented with $\neg c_b$.

$$\frac{c \wedge (\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{while}\ b\ i ; l, \sigma, c\rangle \longmapsto \langle i ; \textbf{while}\ b\ i ; l, \sigma, c \wedge (\rho\ \sigma\ b)\rangle}$$

$$\frac{c \wedge \neg(\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{while}\ b\ i ; l, \sigma, c\rangle \longmapsto \langle l, \sigma, c \wedge \neg(\rho\ \sigma\ b)\rangle}$$

Scalar Assignments. Scalar assignments create a new constraint variable for the program variable to be assigned and add a constraint specifying that the variable is equal to the right-hand side. A new renaming mapping is produced.

$$\frac{\sigma_2 = \sigma_1[v/\sigma_1(v) + 1] \quad \& \quad c_2 \equiv (\rho\ \sigma_2\ v) = (\rho\ \sigma_1\ e)}{\langle v \leftarrow e ; l, \sigma_1, c_1\rangle \longmapsto \langle l, \sigma_2, c_1 \wedge c_2\rangle}$$
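The renaming performed by this rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the prototype's renaming code: the class VersionMap and its methods are hypothetical, constraints are built as plain strings of the form x^i, and the right-hand side is assumed to be a non-empty expression whose tokens are separated by whitespace.

```java
import java.util.HashMap;
import java.util.Map;

final class VersionMap {
    // sigma: current version number of each variable, 0 by default (sigma_bottom).
    private final Map<String, Integer> version = new HashMap<>();

    int current(String x) {
        return version.getOrDefault(x, 0);
    }

    /** rho sigma x for a variable: attach the current version number. */
    String rename(String x) {
        return x + "^" + current(x);
    }

    /** rho sigma e for a whitespace-separated expression: rename identifiers only. */
    String renameExpr(String expr) {
        StringBuilder sb = new StringBuilder();
        for (String tok : expr.trim().split("\\s+")) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(Character.isLetter(tok.charAt(0)) ? rename(tok) : tok);
        }
        return sb.toString();
    }

    /** Scalar assignment v <- e: returns the posted constraint and updates sigma. */
    String assign(String v, String expr) {
        String rhs = renameExpr(expr);        // right-hand side uses the old mapping sigma_1
        version.put(v, current(v) + 1);       // sigma_2 = sigma_1[v / sigma_1(v) + 1]
        return rename(v) + " = " + rhs;       // left-hand side uses the new mapping sigma_2
    }

    public static void main(String[] args) {
        VersionMap sigma = new VersionMap();
        System.out.println(sigma.assign("u", "m - 1"));
        System.out.println(sigma.assign("u", "u - 1"));
    }
}
```

The first call yields the constraint u^1 = m^0 - 1, which matches the constraint posted for line 9 of the binary search example; the second call shows that an assignment such as u = u - 1 links the new version u^2 to the old one u^1.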

Assignments of Array Elements. The assignment of an array element creates a new constraint array, adds a constraint for the element being indexed, and posts constraints specifying that all the new constraint variables in the array are equal to their earlier version, except for the element being indexed. Note that the index is an expression which may contain variables as well, giving rise to the well-known element constraint in constraint programming [25].

$$\frac{\begin{array}{l}
\sigma_2 = \sigma_1[a/\sigma_1(a) + 1] \\
c_2 \equiv (\rho\ \sigma_2\ a)[\rho\ \sigma_1\ e_1] = (\rho\ \sigma_1\ e_2) \\
c_3 \equiv \forall i \in 0..a.length : (\rho\ \sigma_1\ e_1) \neq i \Rightarrow (\rho\ \sigma_2\ a)[i] = (\rho\ \sigma_1\ a)[i]
\end{array}}{\langle a[e_1] \leftarrow e_2 ; l, \sigma_1, c_1\rangle \longmapsto \langle l, \sigma_2, c_1 \wedge c_2 \wedge c_3\rangle}$$

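Assert Statements. An assert statement checks whether the assertion is implied by the constraint store, in which case it proceeds normally. Otherwise, it terminates the execution with an error.

$$\frac{c \Rightarrow (\rho\ \sigma\ b)}{\langle \textbf{assert}\ b ; l, \sigma, c\rangle \longmapsto \langle l, \sigma, c\rangle} \qquad \frac{c \wedge \neg(\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{assert}\ b ; l, \sigma, c\rangle \longmapsto \langle \perp, \sigma, c\rangle}$$

Enforce Statements. An enforce statement adds a constraint to the constraint store if it is satisfiable.

$$\frac{c \wedge (\rho\ \sigma\ b) \text{ is satisfiable}}{\langle \textbf{enforce}\ b ; l, \sigma, c\rangle \longmapsto \langle l, \sigma, c \wedge (\rho\ \sigma\ b)\rangle}$$

Block Statements. Block statements simply remove the braces.

$$\langle \{l_1\} ; l_2, \sigma, c\rangle \longmapsto \langle l_1 : l_2, \sigma, c\rangle$$

Return Statements. A return statement simply constrains the result variable.

$$\frac{c_2 \equiv (\rho\ \sigma_1\ result) = (\rho\ \sigma_1\ e)}{\langle \textbf{return}\ e ; l, \sigma_1, c_1\rangle \longmapsto \langle \top, \sigma_1, c_1 \wedge c_2\rangle}$$

Termination. Termination also occurs when no instruction remains.

$$\langle \epsilon, \sigma, c\rangle \longmapsto \langle \top, \sigma, c\rangle$$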

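The CPBPV Semantics. Let $\mathcal{P}$ be the program $b_{pre}\ l\ b_{post}$, in which $b_{pre}$ denotes the precondition, $l$ is a list of instructions, and $b_{post}$ the post-condition. Let $\longmapsto^*$ be the transitive closure of $\longmapsto$. The final states are specified by the set

$$SFN(b_{pre}, \mathcal{P}) = \{ \langle f, \sigma, c\rangle \mid \langle l, \sigma_\perp, \rho\ \sigma_\perp\ b_{pre}\rangle \longmapsto^* \langle f, \sigma, c\rangle \wedge f \in \{\perp, \top\} \}$$

The program violates an assertion if the set

$$SFE(b_{pre}, \mathcal{P}, b_{post}) = \{ \langle \perp, \sigma, c\rangle \in SFN(b_{pre}, \mathcal{P}) \}$$

is not empty. It violates its specification if the set

$$SFE(b_{pre}, \mathcal{P}, b_{post}) = \{ \langle \top, \sigma, c\rangle \in SFN(b_{pre}, \mathcal{P}) \mid c \wedge (\rho\ \sigma\ \neg b_{post}) \text{ satisfiable} \}$$

is not empty. It is partially correct otherwise.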

4 Implementation issues 

The CPBPV framework is parametrized by a list of solvers $(S_1, \ldots, S_k)$ which are tried in sequence, starting with the least expensive and least general. When checking satisfiability, the verifier never tries solvers $S_{i+1}, \ldots, S_k$ if solver $S_i$ is a decision procedure for the constraint store. If solver $S_i$ is not a decision procedure, it uses an abstraction $\alpha$ of the constraint store $c$ satisfying $c \Rightarrow \alpha$ and can still detect failed execution paths quickly. The last solver in the sequence is a constraint-programming solver (CP solver) over finite domains which iterates pruning and searching to find solutions or prove infeasibility. When the CP solver makes a choice, the earlier solvers in the sequence are called once again to prune the search space or find solutions if they have become decision procedures. Our prototype implementation uses a sequence (MIP, CP), where MIP is the mixed integer-programming tool ILOG CPLEX² and CP is the constraint-programming tool ILOG JSOLVER. Our Java implementation also performs some trivial simplifications such as constant propagation but is otherwise not optimized in its use of the solvers and in its renaming process, whose speed and memory usage could be improved substantially. Practically, simplifications are done on the fly and the MIP solver is called at each node of the executable paths. The CP solver is only called at the end of the executable paths when the complete post-condition is considered. Currently, the implementation uses a depth-first strategy for the CP solver, but modern CP languages now offer high-level abstractions to implement other exploration strategies. In practice, when CPBPV is used for model checking as discussed below, it is probably advisable to use a depth-first iterative deepening implementation.

² See http://www.ilog.com/products/
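The solver sequence can be pictured with the following sketch. It is an illustration under explicit assumptions and does not reflect the prototype's interface to CPLEX or JSOLVER: the types Verdict, Solver and Store are hypothetical, the toy store constrains a single integer x by bounds plus an extra predicate, the cheap solver only looks at the bounds (an abstraction implied by the store), and the last solver enumerates the finite domain.

```java
import java.util.List;
import java.util.function.IntPredicate;

final class SolverCascade {
    enum Verdict { SAT, UNSAT, UNKNOWN }

    interface Solver<S> { Verdict check(S store); }

    /** Tries the solvers in order and stops at the first definite answer. */
    static <S> Verdict check(List<Solver<S>> solvers, S store) {
        for (Solver<S> solver : solvers) {
            Verdict v = solver.check(store);
            if (v != Verdict.UNKNOWN) return v;
        }
        return Verdict.UNKNOWN;
    }

    /** Toy store: lo <= x <= hi and extra(x). */
    static final class Store {
        final int lo, hi;
        final IntPredicate extra;
        Store(int lo, int hi, IntPredicate extra) { this.lo = lo; this.hi = hi; this.extra = extra; }
    }

    // Cheap and incomplete: reasons on the bounds only, an abstraction of the store.
    // It can prove inconsistency but never claims satisfiability.
    static final Solver<Store> BOUNDS = s -> s.lo > s.hi ? Verdict.UNSAT : Verdict.UNKNOWN;

    // Expensive and complete on this finite domain: enumerates all values.
    static final Solver<Store> ENUMERATE = s -> {
        for (int x = s.lo; x <= s.hi; x++) {
            if (s.extra.test(x)) return Verdict.SAT;
        }
        return Verdict.UNSAT;
    };

    static Verdict verdict(Store s) {
        return check(List.of(BOUNDS, ENUMERATE), s);
    }

    public static void main(String[] args) {
        System.out.println(verdict(new Store(5, 1, x -> true)));          // settled by BOUNDS
        System.out.println(verdict(new Store(0, 10, x -> x * x == 49)));  // needs ENUMERATE
    }
}
```

The design point this is meant to convey is that an incomplete solver is still useful: since the store implies its abstraction, an inconsistent abstraction is enough to discard the path without calling the more expensive solver.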

5 Experimental results 

In this section, we report experimental results for a set of traditional benchmarks for program verification. We compare CPBPV with the following frameworks:

– ESC/Java is an Extended Static Checker for Java to find common run-time errors in JML-annotated Java programs by static analysis of the code and its annotations. See http://kind.ucd.ie/products/opensource/ESCJava2/
– CBMC is a Bounded Model Checker for ANSI-C and C++ programs. It allows for the verification of array bounds (buffer overflows), pointer safety, exceptions, and user-specified assertions. See http://www.cprover.org/cbmc/
– BLAST, the Berkeley Lazy Abstraction Software Verification Tool, is a software model checker for C programs. See http://mtc.epfl.ch/software-tools/blast/
– EUREKA is a C bounded model checker which uses an SMT solver instead of a SAT solver. See http://www.ai-lab.it/eureka/
– Why is a software verification platform which integrates many existing provers (proof assistants such as Coq, PVS, HOL 4, ...) and decision procedures (such as Simplify, Yices, ...). See http://why.lri.fr/



Of course, neither the expressiveness nor the objectives of all these systems are the same as those of CPBPV. For instance, some of them can handle CTL/LTL constraints whereas CPBPV does not yet support this kind of constraint. Nevertheless, this comparison is useful to illustrate the capabilities of CPBPV. All experiments were performed on the same machine, an Intel(R) Pentium(R) M processor 1.86GHz with 1.5G of memory, using the version of the verifiers that can be downloaded from their web sites (except for EUREKA, for which the execution times given in [2,3] are reported). For each benchmark program, we describe the data entries and the verification parameters. In the tables, "UNABLE" means that the corresponding framework is unable to validate the program, either because of a lack of expressiveness or because of time or memory limitations, "NOT_FOUND" that it does not detect an error, and "FALSE_ERROR" that it reports an error in a correct program. Complete details of the experiments, including input files and error traces, can be found in [13].

Binary search. We start with the binary search program presented in Figure 1, to which ESC/Java is applied directly. ESC/Java requires a limit on the number of loop unfoldings, which we set to log(n) + 1, the worst-case complexity of the binary search algorithm for an array of length n. Similarly, CBMC requires an overestimate of the number of loop unfoldings. Since CBMC does not support first-order expressions such as the JML \forall statement,
CPBPV      array length        8        16      32       64        128       256
           time                1.081s   1.69s   4.043s   17.009s   136.80s   1731.696s
CBMC       array length        8        16      32       64        128       256
           time                1.37s    1.43s   UNABLE   UNABLE    UNABLE    UNABLE
Why        with invariant      11.18s
           without invariant   UNABLE
ESC/Java   FALSE_ERROR
BLAST      UNABLE

Table 1. Comparison table for binary search

we generated a C program for each instance of the problem (i.e., each array length). For example, the postcondition for an array of length 8 is given by

(result != -1 && a[result]==x) ||
(result == -1 && (a[0]!=x && a[1]!=x && a[2]!=x && a[3]!=x && a[4]!=x && a[5]!=x && a[6]!=x && a[7]!=x))
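Writing such an expanded postcondition by hand for each array length is tedious, so it is natural to generate it. The following sketch is one possible generator and is not the script used for the experiments: the class PostconditionUnroller and the method unroll are hypothetical names, the variable names result, a and x follow the expression above, and n is assumed to be at least 1.

```java
final class PostconditionUnroller {
    /**
     * Builds the quantifier-free C postcondition of binary search for an
     * array of length n (n >= 1 assumed), expanding the universally
     * quantified clause into an explicit conjunction over all indices.
     */
    static String unroll(int n) {
        StringBuilder absent = new StringBuilder();
        for (int k = 0; k < n; k++) {
            if (k > 0) absent.append(" && ");
            absent.append("a[").append(k).append("] != x");
        }
        return "(result != -1 && a[result] == x) || (result == -1 && (" + absent + "))";
    }

    public static void main(String[] args) {
        System.out.println(unroll(8));
    }
}
```

The size of the generated expression grows linearly with the array length, which is one reason a separate C program is needed per instance.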



For the Why framework, we used the binary search version given in their distribution. This program uses an assert statement to give a loop invariant.

Note that CPBPV does not require any additional information: no invariant and no limits on loop unfoldings. During execution, it selects a path by nondeterministically applying the semantic rules for conditional and loop expressions.

Table 1 reports the experimental results. Execution times for CPBPV are reported as a function of the array length for integers coded on 31 bits³. Our implementation is optimized for neither time nor space at this stage, and times are only given to demonstrate the feasibility of the CPBPV verifier.

The "Why" framework [TH] was unable to verify the correctness without the loop invariant; 60% of the proof obligations remained unknown.

The CBMC framework was not able to do the verification for an instance of length 32 (it was interrupted after 6691.87s).

ESC/Java was unable to verify the correctness of this program unless complete loop invariants are provided⁴.

An Incorrect Binary Search. Table 2 reports experimental results for an incorrect binary search program (see Figure 1, line 11) for CPBPV, ESC/Java, CBMC, and Why using an invariant. The error trace found with CPBPV has been described in Section 2. The error traces provided by CBMC and ESC/Java, which only show the decisions taken along the faulty path, can be found in [13]. In contrast to CPBPV, they do not provide any value for the array or for the searched data. Observe that CPBPV provides orders of magnitude improvements in efficiency over CBMC and also outperforms ESC/Java by roughly a factor of 8 on the largest instance.

³ The commercial MIP solver fails with 32-bit domains because of scaling issues.
⁴ A version with loop invariants that allows showing the correctness of this program was written by David Cok, a developer of ESC/Java, after we contacted him.
             CPBPV    ESC/Java   CBMC      Why with invariant   BLAST
length 8     0.027s   1.21s      1.38s     NOT_FOUND            UNABLE
length 16    0.037s   1.347s     1.69s     NOT_FOUND            UNABLE
length 32    0.064s   1.792s     7.62s     NOT_FOUND            UNABLE
length 64    0.115s   1.886s     27.05s    NOT_FOUND            UNABLE
length 128   0.241s   1.964s     189.20s   NOT_FOUND            UNABLE

Table 2. Experimental Results for an Incorrect Binary Search

        CPBPV    ESC/Java   CBMC    Why     BLAST
time    0.287s   1.828s     0.82s   8.85s   UNABLE

Table 3. Experimental Results on the Tritype Program



The Tritype Program. The tritype program is a standard benchmark in test case generation and program verification since it contains numerous non-feasible paths: only 10 paths correspond to actual inputs because of complex conditional statements in the program. The program takes three positive integers as inputs (the triangle sides) and returns 2 if the inputs correspond to an isosceles triangle, 3 if they correspond to an equilateral triangle, 1 if they correspond to some other triangle, and 4 otherwise. The tritype program in Java with its specification in JML can be found in [13]. Table 3 depicts the experimental results for CPBPV, ESC/Java, CBMC, BLAST and Why. BLAST was unable to validate this example because the current version does not handle linear arithmetic. Observe the excellent performance of CPBPV, and note that our previous approach, which used constraint programming and Boolean abstraction to abstract the conditions, validated this benchmark in 8.52 seconds when integers were coded on 16 bits [12]. It also explored 92 spurious paths.



An Incorrect Tritype Program. Consider now an incorrect version of the Tritype program in which the test "if ((trityp==2)&&(i+k>j))" in line 22 (see [13]) is replaced by "if ((trityp==1)&&(i+k>j))". Since the local variable trityp is equal to 2 when i==k, the condition (i+k)>j implies that (i,j,k) are the sides of an isosceles triangle (the two other triangular inequalities are trivial because j>0). But when trityp==1, i==j holds and this incorrect version may answer that the triangle is isosceles while it may not be a triangle at all. For example, it will return 2 when (i,j,k)=(1,1,2). Table 4 depicts the experimental results. Execution times correspond to the time required to find the first error. The error found with CPBPV corresponds to the input values (i,j,k) = (1,1,2) mentioned earlier. Once again, observe the excellent behavior of CPBPV compared to the remaining tools.⁵

⁵ For CBMC, we contacted D. Kroening, who recommended using the option CPROVER_assert. If we do so, CBMC is able to find the error, but we must add
        CPBPV    ESC/Java   CBMC        Why
time    0.056s   1.853s     NOT_FOUND   NOT_FOUND

Table 4. Experimental Results for the Incorrect Tritype Program

            CPBPV    ESC/Java   CBMC     EUREKA
length 8    1.45s    3.778s     1.11s    91s
length 16   2.97s    UNABLE     2.01s    UNABLE
length 32   UNABLE   UNABLE     6.10s    UNABLE
length 64   UNABLE   UNABLE     37.65s   UNABLE

Table 5. Experimental Results for Bubble Sort



Bubble Sort with Initial Condition. This benchmark (see [13]) is taken from [2] and performs a bubble sort of an array t which contains integers from 0 to t.length given in decreasing order. Table 5 shows the comparative results for this benchmark. CPBPV was limited on this benchmark because its recursive implementation uses up all the Java stack space. This problem should be remedied by removing recursion in CPBPV.



Selection Sort. We now present a benchmark to highlight both modular verification and the element constraint of constraint programming to index arrays with arbitrary expressions. The benchmark is described in [13]. Assume that function findMin has been verified for arbitrary integers. When encountering a call to findMin, CPBPV first checks whether its precondition is entailed by the constraint store, which requires a consistency check of the constraint store with respect to the negation of the precondition. Then CPBPV replaces the call by the post-condition, where the formal parameters are replaced by the actual variables. In particular, for the first iteration of the loop and an array length of 40, CPBPV generates the conjunction $0 \le k^0 < 40 \wedge t^0[k^0] \le t^0[0] \wedge \ldots \wedge t^0[k^0] \le t^0[39]$, which features the element constraint [25]. Indeed, $k^0$ is a variable and a constraint like $t^0[k^0] \le t^0[0]$ indexes the array $t^0$ of variables using $k^0$.
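To make the shape of this post-condition concrete, the sketch below states it as an executable check. The paper does not give the signature or the body of findMin, so both are assumptions here: findMin(t, from) is a hypothetical function returning the index of a minimal element of t[from..t.length-1], and post is the corresponding post-condition, which for from = 0 and a length of 40 is the conjunction above.

```java
final class FindMinContract {
    /** Hypothetical findMin: index of a minimal element of t[from..t.length-1]. */
    static int findMin(int[] t, int from) {
        int k = from;
        for (int i = from + 1; i < t.length; i++) {
            if (t[i] < t[k]) k = i;
        }
        return k;
    }

    /**
     * Post-condition used in place of the call during modular verification:
     * from <= k < t.length and t[k] <= t[i] for every i in from..t.length-1.
     */
    static boolean post(int[] t, int from, int k) {
        if (k < from || k >= t.length) return false;
        for (int i = from; i < t.length; i++) {
            if (t[k] > t[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        int[] t = {5, 3, 9, 3};
        int k = findMin(t, 0);
        System.out.println(k + " " + post(t, 0, k));
    }
}
```

Note that the post-condition does not determine k uniquely when the minimum occurs several times; the verifier reasons only from the post-condition, so any index satisfying it must be accounted for.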

The modular verification of the selection sort explores only a single path, is independent of the integer representation, and takes less than 0.01s for arrays of size 40. The bottleneck in verifying selection sort is the validation of function findMin, which requires the exploration of many paths. However, the complete validation of selection sort takes less than 4 seconds for an array of length 6. Once again, this should be contrasted with the model-checking approach of Eureka [2]. On a version of selection sort where all variables are assigned specific values (contrary to our verification, which makes no assumptions on the inputs), Eureka takes 104 seconds on a faster machine. Reference [2] also reports that CBMC takes 432.6 seconds, that BLAST cannot solve this problem, and that SATABS [9] only verifies the program for an array with 2 elements.

(footnote on CBMC, continued) some assumptions stating that there is no overflow in the sums, in order to prove the correct version of tritype with this same option.

Sum of Squares. Our last benchmark is described in [12] and computes the sum of the squares of the n first integers stored in an array. The precondition states that n is the size of the array and that t must contain any possible permutation of the n first integers. The postcondition states that the result is $n \times (n + 1) \times (2 \times n + 1)/6$. The benchmark illustrates two functionalities of constraint programming: the ability to specify combinatorial constraints and to solve nonlinear problems. The alldifferent constraint [23] in the precondition specifies that all the elements of the array are different, while the program constraints and postcondition involve quadratic and cubic constraints. The maximum instance that we were able to solve with CPBPV was an array of size 10 in 66.179s.
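The property being verified can be checked on a single concrete input as follows. This is only a sanity check of the post-condition, not the benchmark program of [12]; the class and method names are hypothetical, and "the n first integers" is assumed to mean 1..n.

```java
final class SumOfSquares {
    /** Sum of the squares of the elements of t. */
    static long sumOfSquares(int[] t) {
        long s = 0;
        for (int x : t) {
            s += (long) x * x;
        }
        return s;
    }

    /** Closed form of the post-condition: n (n + 1) (2n + 1) / 6. */
    static long closedForm(int n) {
        return (long) n * (n + 1) * (2L * n + 1) / 6;
    }

    public static void main(String[] args) {
        int[] t = {3, 1, 2};   // a permutation of 1..3 (assumed reading of the precondition)
        System.out.println(sumOfSquares(t) + " " + closedForm(t.length));
    }
}
```

A test of this kind covers one permutation at a time, whereas the verifier must establish the equality for every array satisfying the alldifferent precondition, which is where the nonlinear constraints make the problem hard.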

CPLEX, the MIP solver, plays a key role in all these benchmarks. For instance, the CP solver is never called in the Tritype benchmark. For the Binary search benchmark, there are length calls to the CP solver but almost 75% of the CPU time is spent in the CP solver. Since there is only one path in the Bubble sort benchmark, the CP solver is only called once. In the Sum of squares example, 80% of the CPU time is spent in the CP solver.

6 Discussion and Related Work 

We briefly review recent work in constraint programming and model checking 
for software testing, validation, and verification. We outline the main differences 
between our CPBPV framework and existing approaches. 

Constraint Logic Programming Constraint logic programming (CLP) was 
used for test generation of programs (e.g., [17l20l24fTO] ) and provides a nice 
implementation tool extending symbolic execution techniques [4]. Gotlieb et al. 
showed how to represent imperative programs as constraint logic programs and 
used predicate abstraction (from model checking) and conditional constraints 
within a CLP framework. Flanagan [15] formalized the translation of imperative 
programs into CLP, argued that it could be used for bounded model checking, 
but did not provide an implementation. The test-generation methodology was 
generalized and applied to bounded program verification in [11,12]. The 
implementation used dedicated predicate abstractions to reduce the exploration of 
spurious execution paths. However, as shown in the paper, the CPBPV 
verifier is significantly more efficient and often avoids the generation of spurious 
execution paths completely. 

Model Checking It is also useful to contrast the CPBPV verifier with model- 
checking of software systems. SAT-based bounded model checking for software[B| 
consists in building a propositional formula whose models correspond to exe- 
cution paths of bounded length violating some properties and in using SAT 
solvers to check whether the resulting formula is satisfiable. SAT-based 
model-checking platforms [6] have been widely popular thanks to significant progress 
in SAT solvers. A fundamental issue faced by model checkers is the state space 
explosion of the resulting model. Various techniques have been proposed to address 
this challenge, including generalized symbolic execution (e.g., [21]), SMT-based 
model checking, and abstraction/refinement techniques. SMT-based model 
checking is the idea of representing and checking quantifier-free formulas in a 
more general decidable theory (e.g., [18,14,22]). These SMT solvers integrate 
dedicated solvers and share some of the motivations of constraint programming. 
Predicate abstraction is another popular technique to address the state space 
explosion. The idea consists in abstracting the program to obtain an abstract 
program on which model checking is performed. The model checker may then 
generate an abstract counterexample which must be checked to determine if it 
corresponds to a concrete execution path. If the counterexample is spurious, the 
abstract program is refined and the process is iterated. A successful predicate 
abstraction consists of abstracting the concrete program into a Boolean program 
(e.g., [5,7,8]). In recent work [3,2], Armando et al. proposed to abstract concrete 
programs into linear programs and used an abstraction of sets of variables and 
array indices. They showed that their tool compares favourably and, on some 
of the programs considered in this paper, outperforms model checkers based on 
predicate abstraction. 

Our CPBPV verifier contrasts with SAT-based model checkers, SMT-based model 
checkers and predicate abstraction based approaches: It does not abstract the 
program and does not generate spurious execution paths. Instead it uses a 
constraint-solver and nondeterministic exploration to incrementally construct 
abstractions of execution paths. The abstraction uses constraint stores to rep- 
resent sets of concrete stores. On many bounded verification benchmarks, our 
preliminary experimental results show significant improvements over the state-of-the-art 
results in [2]. Model checking is well adapted to checking low-level C 
programs and hardware applications with numerous Boolean constraints and bitwise 
operations: it was successfully used to compare an ANSI C program with a 
circuit given as design in Verilog [7]. However, it is important to observe that in 
model checking, one is typically interested in checking some specific properties 
such as buffer overflows, pointer safety, or user-specified assertions. These prop- 
erties are typically much less detailed than our post-conditions and abstracting 
the program may speed up the process significantly. In our CPBPV verifier, it 
is critical to explore all execution paths and the main issue is how to effectively 
abstract memory stores by constraints and how to check satisfiability incrementally. 
It is an intriguing issue to determine whether a hybridization of the two 
approaches would be beneficial for model checking, an issue briefly discussed in 
the next section. Observe also that this research provides convincing evidence of 
the benefits of Nieuwenhuis' challenge [22] aiming at extending SMT with CP 
techniques. 
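
The exploration strategy described above can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the tuple encoding of statements, the names feasible, explore and verify, and the use of exhaustive enumeration over bounded domains as a stand-in for the solvers. It also tracks assignments by substitution rather than by the variable renaming used in the CPBPV framework. What it does share with the framework is the overall scheme: each branch adds its guard to the constraint store, a path is dropped as soon as its store is inconsistent, and at the end of a path the store must imply the post-condition.

```python
from itertools import product

def feasible(store, domains):
    """Stand-in for a solver: return an input satisfying every constraint, or None."""
    names = list(domains)
    for values in product(*(domains[n] for n in names)):
        env = dict(zip(names, values))
        if all(c(env) for c in store):
            return env
    return None

def concretize(sym, env):
    """Evaluate every symbolic program variable for one concrete input."""
    return {v: f(env) for v, f in sym.items()}

def explore(stmts, sym, store, domains, post, stats):
    """Return an input violating post on some feasible path, or None."""
    if not stmts:
        stats["paths"] += 1
        # The store must imply post: store AND (NOT post) has to be infeasible.
        violated = lambda env: not post(concretize(sym, env))
        return feasible(store + [violated], domains)
    head, rest = stmts[0], stmts[1:]
    if head[0] == "assign":
        _, var, expr = head
        updated = dict(sym)
        updated[var] = lambda env: expr(concretize(sym, env))
        return explore(rest, updated, store, domains, post, stats)
    _, cond, then_branch, else_branch = head  # head[0] == "if"
    holds = lambda env: cond(concretize(sym, env))
    fails = lambda env: not holds(env)
    for guard, branch in ((holds, then_branch), (fails, else_branch)):
        extended = store + [guard]
        if feasible(extended, domains) is None:
            stats["pruned"] += 1  # inconsistent store: drop this path now
            continue
        cex = explore(branch + rest, sym, extended, domains, post, stats)
        if cex is not None:
            return cex
    return None

def verify(program, domains, post, stats):
    """Start from an empty store with each input bound to itself."""
    sym = {name: (lambda env, name=name: env[name]) for name in domains}
    return explore(program, sym, [], domains, post, stats)

# Hypothetical example: absolute value over a small bounded domain.
DOMAINS = {"x": range(-3, 4)}
ABS = [
    ("if", lambda s: s["x"] < 0,
        [("assign", "r", lambda s: -s["x"])],
        [("assign", "r", lambda s: s["x"])]),
    # r < 0 cannot hold here, so this then-branch is pruned on both paths.
    ("if", lambda s: s["r"] < 0, [("assign", "r", lambda s: 0)], []),
]
BUGGY_ABS = [
    ("if", lambda s: s["x"] < 0,
        [("assign", "r", lambda s: s["x"])],  # bug: sign not flipped
        [("assign", "r", lambda s: s["x"])]),
]
abs_post = lambda s: s["r"] >= 0 and s["r"] in (s["x"], -s["x"])
```

Calling verify(ABS, DOMAINS, abs_post, stats) with stats = {"paths": 0, "pruned": 0} returns None, meaning no counterexample exists within the bounds, while the same call on BUGGY_ABS returns a negative input as a counterexample.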



See also [1] for a study of the relations between constraint programming and 
Satisfiability Modulo Theories (SMT). 
7 Perspectives and Future Work 

This paper introduced the CPBPV framework for bounded program verification. 
Its novelty is to use constraints to represent sets of memory stores and to explore 
execution paths over these constraint stores nondeterministically and incrementally. 
The CPBPV verifier exploits the fact that, when variables and arrays are 
bounded, the constraint store can always be checked for feasibility. As a result, it 
never explores spurious execution paths, contrary to earlier approaches combining 
constraint programming and predicate abstraction [11,12] or integrating SMT 
solvers and the abstraction/refinement approach from model checking [2]. We 
demonstrated the CPBPV verifier on a number of standard benchmarks from 
model checking and program checking as well as on nonlinear programs and 
functions using complex array indexings, and showed how to perform modular 
verification. The experimental results demonstrate the potential of the approach: 
The CPBPV verifier provides significant gains in performance and functionality 
compared to other tools. 

Our current work aims at improving and generalizing the framework and implementation. 
In particular, we would like to include tailored, lightweight solvers 
for a variety of constraint classes, to optimize the array implementation, 
and to integrate Java objects and references. There are also many research 
avenues opened by this research, two of which are reviewed now. 

Currently, the CPBPV verifier does not check for variable overflows: the 
constraint store enforces that variables take values inside their domains, and execution 
paths violating these constraints are thus not considered. It is possible 
to generalize the CPBPV verifier to check overflows as the verification proceeds. 
The key idea is to check before each assignment whether the constraint store entails that 
the value produced fits in the selected integer representation, and to generate an 
error otherwise. (Similar assertions must in fact be checked for each subexpression 
of the right-hand side, in the language's evaluation order.) Interval techniques 
on floats [4] may be used to obtain conservative checking of such assertions. 
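
As a rough illustration of such a conservative check, the following Python sketch evaluates the right-hand side of an assignment over intervals and reports the first subexpression, in left-to-right evaluation order, whose interval does not fit in the integer representation. The 32-bit signed range, the tuple encoding of expressions, and the name check are assumptions made for this sketch, and it uses integer intervals rather than the float-based techniques of [4].

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1  # assumed 32-bit signed representation

def add(a, b):
    """Interval addition on (low, high) pairs."""
    return (a[0] + b[0], a[1] + b[1])

def mul(a, b):
    """Interval multiplication: the extremes occur at the corner products."""
    corners = [x * y for x in a for y in b]
    return (min(corners), max(corners))

OPS = {"+": add, "*": mul}

def fits(iv):
    return INT_MIN <= iv[0] and iv[1] <= INT_MAX

def check(expr, bounds):
    """Return (interval, None) if no subexpression can overflow,
    otherwise (None, first offending subexpression).

    expr is an int literal, a variable name, or a tuple (op, left, right);
    bounds maps each variable name to its (low, high) interval.
    """
    if isinstance(expr, int):
        return (expr, expr), None
    if isinstance(expr, str):
        return bounds[expr], None
    op, left, right = expr
    l, bad = check(left, bounds)
    if bad is not None:
        return None, bad
    r, bad = check(right, bounds)
    if bad is not None:
        return None, bad
    iv = OPS[op](l, r)
    return (iv, None) if fits(iv) else (None, expr)
```

The check is conservative because intervals ignore relations between variables: it may flag a subexpression that cannot overflow in any real execution, but it never accepts one that can.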

An intriguing direction is to use the CPBPV approach for property checking. 
Given an assertion to be verified, one may perform a backward execution 
from the assertion to the function entry point. The negation of the assertion is 
now the pre-condition and the pre-condition becomes the post-condition. This 
requires specifying inverse renaming and executions of conditional and iterative 
statements, but these have already been studied in the context of test generation. 

Acknowledgements Many thanks to Jean-François Couchot for much help on 